Evaluate with Confidence Estimation: Machine ranking of translation outputs using grammatical features
Abstract
We present a pilot study of an evaluation method that ranks translation outputs without a reference translation, given only their source sentence. The system employs a statistical classifier trained on existing human rankings, using several features derived from analysis of both the source and the target sentences. Development experiments on one language pair showed that the method correlates well with human rankings when using features obtained from a PCFG parser.
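The general idea of learning a ranker from human preferences can be sketched as follows. This is a minimal illustration and not the study's implementation: the abstract says only "statistical classifier", so the choice of logistic regression over pairwise feature differences is an assumption, and the two features (a normalized parse score and an unknown-word ratio) along with all names and numbers are hypothetical toy values.

```python
import math
from itertools import combinations


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_pairwise(pairs, epochs=200, lr=0.1):
    """Fit weights on feature differences of translation pairs.

    pairs: list of (features_a, features_b, label), where label is 1
    if human annotators preferred a over b, and 0 otherwise.
    """
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for fa, fb, label in pairs:
            diff = [a - b for a, b in zip(fa, fb)]
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            # Stochastic gradient step on the log-loss.
            for i in range(dim):
                w[i] += lr * (label - p) * diff[i]
    return w


def rank(candidates, w):
    """Order candidate names best-first by number of pairwise wins."""
    wins = {name: 0 for name in candidates}
    for a, b in combinations(candidates, 2):
        diff = [x - y for x, y in zip(candidates[a], candidates[b])]
        if sigmoid(sum(wi * di for wi, di in zip(w, diff))) >= 0.5:
            wins[a] += 1
        else:
            wins[b] += 1
    return sorted(candidates, key=lambda n: wins[n], reverse=True)


# Hypothetical features: [normalized parse score, unknown-word ratio].
training = [
    ([0.9, 0.0], [0.4, 0.2], 1),
    ([0.3, 0.3], [0.8, 0.1], 0),
    ([0.7, 0.1], [0.2, 0.4], 1),
    ([0.1, 0.5], [0.6, 0.0], 0),
]
weights = train_pairwise(training)
outputs = {"sys_a": [0.8, 0.05], "sys_b": [0.5, 0.2], "sys_c": [0.2, 0.4]}
print(rank(outputs, weights))
```

No reference translation appears anywhere in the sketch: the ranker sees only features computed from the source and the candidate outputs, which is the property the abstract emphasizes.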
Similar resources
Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors
In this paper, we present two dependency parser training methods appropriate for parsing outputs of statistical machine translation (SMT), which pose problems to standard parsers due to their frequent ungrammaticality. We adapt the MST parser by exploiting additional features from the source language, and by introducing artificial grammatical errors in the parser training data, so that the trai...
Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with blackbox features originating from PCFG Parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new el...
Selecting Feature Sets for Comparative and Time-Oriented Quality Estimation of Machine Translation Output
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with blackbox features originating from PCFG Parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new el...
Automatic Projection of Semantic Structures: an Application to Pairwise Translation Ranking
We present a model for the inclusion of semantic role annotations in the framework of confidence estimation for machine translation. The model has several interesting properties, most notably: 1) it only requires a linguistic processor on the (generally well-formed) source side of the translation; 2) it does not directly rely on properties of the translation model (hence, it can be applied beyo...
Quality Estimation Of Machine Translation Outputs Through Stemming
Machine translation is a challenging problem for Indian languages. New machine translators are developed every day, but high-quality automatic translation is still a very distant dream. A correctly translated sentence for the Hindi language is rarely found. In this paper, we focus on the English-Hindi language pair, so in order to preserve the correct MT output we ...